{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1949a695",
   "metadata": {},
   "source": [
    "# Day 6：合并、分组与小型流程项目（串联前五天）\n",
    "\n",
    "> 今日目标：掌握表合并 (merge / concat) 与分组聚合 (groupby + agg)，完成一个迷你分析闭环：提出问题 → 处理 → 结果图表 → 简短结论。\n",
    "\n",
    "学习路径：\n",
    "1. 准备数据集介绍\n",
    "2. 合并 merge 基本用法 (one-to-one / many-to-one)\n",
    "3. concat 纵向拼接\n",
    "4. groupby 分组聚合与多指标 agg\n",
    "5. 新增衍生指标 (人均/差值)\n",
    "6. Mini Project 任务说明\n",
    "7. 项目示例骨架 (可直接改)\n",
    "8. 小结\n",
    "9. 课后作业要求"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4e16615",
   "metadata": {},
   "source": [
    "## 1. 准备数据集\n",
    "我们将使用：\n",
    "- `air_quality_timeseries.csv` (Day4/5 见过)\n",
    "- `city_info.csv` ：包含城市人口 (百万) / 行政区域 (region) / 面积 (km²)\n",
    "目标：合并后计算 *人均 PM2.5* / 区域对比 / 排名。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b5869c7c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "date",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "province",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "PM10",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "NO2",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "SO2",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "be801a1a-2c74-4a70-9b1a-78e691361d46",
       "rows": [
        [
         "0",
         "2025-09-01 00:00:00",
         "广州",
         "广东",
         "42",
         "55",
         "19",
         "7"
        ],
        [
         "1",
         "2025-09-02 00:00:00",
         "广州",
         "广东",
         "41",
         "54",
         "18",
         "7"
        ],
        [
         "2",
         "2025-09-03 00:00:00",
         "广州",
         "广东",
         "39",
         "50",
         "17",
         "6"
        ],
        [
         "3",
         "2025-09-04 00:00:00",
         "广州",
         "广东",
         "45",
         "57",
         "20",
         "7"
        ],
        [
         "4",
         "2025-09-05 00:00:00",
         "广州",
         "广东",
         "44",
         "56",
         "19",
         "7"
        ]
       ],
       "shape": {
        "columns": 7,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>city</th>\n",
       "      <th>province</th>\n",
       "      <th>PM25</th>\n",
       "      <th>PM10</th>\n",
       "      <th>NO2</th>\n",
       "      <th>SO2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2025-09-01</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>42</td>\n",
       "      <td>55</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2025-09-02</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>41</td>\n",
       "      <td>54</td>\n",
       "      <td>18</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2025-09-03</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>39</td>\n",
       "      <td>50</td>\n",
       "      <td>17</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2025-09-04</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>45</td>\n",
       "      <td>57</td>\n",
       "      <td>20</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2025-09-05</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>44</td>\n",
       "      <td>56</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        date city province  PM25  PM10  NO2  SO2\n",
       "0 2025-09-01   广州       广东    42    55   19    7\n",
       "1 2025-09-02   广州       广东    41    54   18    7\n",
       "2 2025-09-03   广州       广东    39    50   17    6\n",
       "3 2025-09-04   广州       广东    45    57   20    7\n",
       "4 2025-09-05   广州       广东    44    56   19    7"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "air = pd.read_csv('../data/air_quality_timeseries.csv', parse_dates=['date'])\n",
    "info = pd.read_csv('../data/city_info.csv')\n",
    "air.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d46d50b3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "province",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "population_million",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "area_km2",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "6a567d82-29b7-48c1-a646-a52ee6042b23",
       "rows": [
        [
         "0",
         "广州",
         "广东",
         "华南",
         "18.7",
         "7434"
        ],
        [
         "1",
         "深圳",
         "广东",
         "华南",
         "17.9",
         "1997"
        ],
        [
         "2",
         "佛山",
         "广东",
         "华南",
         "9.7",
         "3848"
        ],
        [
         "3",
         "北京",
         "北京",
         "华北",
         "21.9",
         "16411"
        ],
        [
         "4",
         "上海",
         "上海",
         "华东",
         "25.0",
         "6341"
        ]
       ],
       "shape": {
        "columns": 5,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>province</th>\n",
       "      <th>region</th>\n",
       "      <th>population_million</th>\n",
       "      <th>area_km2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>深圳</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>17.9</td>\n",
       "      <td>1997</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>佛山</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>9.7</td>\n",
       "      <td>3848</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>北京</td>\n",
       "      <td>北京</td>\n",
       "      <td>华北</td>\n",
       "      <td>21.9</td>\n",
       "      <td>16411</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>上海</td>\n",
       "      <td>上海</td>\n",
       "      <td>华东</td>\n",
       "      <td>25.0</td>\n",
       "      <td>6341</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  city province region  population_million  area_km2\n",
       "0   广州       广东     华南                18.7      7434\n",
       "1   深圳       广东     华南                17.9      1997\n",
       "2   佛山       广东     华南                 9.7      3848\n",
       "3   北京       北京     华北                21.9     16411\n",
       "4   上海       上海     华东                25.0      6341"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "info.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02663185",
   "metadata": {},
   "source": [
    "## 2. merge 合并基础\n",
    "常见：`pd.merge(left, right, on='key', how='inner')`。\n",
    "how 取值：inner / left / right / outer。今天主要用 inner 或 left。\n",
    "键：城市 city。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "96004e42",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "date",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "province_x",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "PM10",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "NO2",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "SO2",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "province_y",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "population_million",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "area_km2",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "a934b4fa-dbbd-4787-a278-1faae8900c88",
       "rows": [
        [
         "0",
         "2025-09-01 00:00:00",
         "广州",
         "广东",
         "42",
         "55",
         "19",
         "7",
         "广东",
         "华南",
         "18.7",
         "7434"
        ],
        [
         "1",
         "2025-09-02 00:00:00",
         "广州",
         "广东",
         "41",
         "54",
         "18",
         "7",
         "广东",
         "华南",
         "18.7",
         "7434"
        ],
        [
         "2",
         "2025-09-03 00:00:00",
         "广州",
         "广东",
         "39",
         "50",
         "17",
         "6",
         "广东",
         "华南",
         "18.7",
         "7434"
        ],
        [
         "3",
         "2025-09-04 00:00:00",
         "广州",
         "广东",
         "45",
         "57",
         "20",
         "7",
         "广东",
         "华南",
         "18.7",
         "7434"
        ],
        [
         "4",
         "2025-09-05 00:00:00",
         "广州",
         "广东",
         "44",
         "56",
         "19",
         "7",
         "广东",
         "华南",
         "18.7",
         "7434"
        ]
       ],
       "shape": {
        "columns": 11,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>city</th>\n",
       "      <th>province_x</th>\n",
       "      <th>PM25</th>\n",
       "      <th>PM10</th>\n",
       "      <th>NO2</th>\n",
       "      <th>SO2</th>\n",
       "      <th>province_y</th>\n",
       "      <th>region</th>\n",
       "      <th>population_million</th>\n",
       "      <th>area_km2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2025-09-01</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>42</td>\n",
       "      <td>55</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2025-09-02</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>41</td>\n",
       "      <td>54</td>\n",
       "      <td>18</td>\n",
       "      <td>7</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2025-09-03</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>39</td>\n",
       "      <td>50</td>\n",
       "      <td>17</td>\n",
       "      <td>6</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2025-09-04</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>45</td>\n",
       "      <td>57</td>\n",
       "      <td>20</td>\n",
       "      <td>7</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2025-09-05</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>44</td>\n",
       "      <td>56</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "      <td>广东</td>\n",
       "      <td>华南</td>\n",
       "      <td>18.7</td>\n",
       "      <td>7434</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        date city province_x  PM25  PM10  NO2  SO2 province_y region  \\\n",
       "0 2025-09-01   广州         广东    42    55   19    7         广东     华南   \n",
       "1 2025-09-02   广州         广东    41    54   18    7         广东     华南   \n",
       "2 2025-09-03   广州         广东    39    50   17    6         广东     华南   \n",
       "3 2025-09-04   广州         广东    45    57   20    7         广东     华南   \n",
       "4 2025-09-05   广州         广东    44    56   19    7         广东     华南   \n",
       "\n",
       "   population_million  area_km2  \n",
       "0                18.7      7434  \n",
       "1                18.7      7434  \n",
       "2                18.7      7434  \n",
       "3                18.7      7434  \n",
       "4                18.7      7434  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merged = pd.merge(air, info, on='city', how='left')\n",
    "merged.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "f9d26ceb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "合并后形状: (30, 11)\n",
      "列名: ['date', 'city', 'province_x', 'PM25', 'PM10', 'NO2', 'SO2', 'province_y', 'region', 'population_million', 'area_km2']\n",
      "是否存在缺失:\n",
      "date                  0\n",
      "city                  0\n",
      "province_x            0\n",
      "PM25                  0\n",
      "PM10                  0\n",
      "NO2                   0\n",
      "SO2                   0\n",
      "province_y            0\n",
      "region                0\n",
      "population_million    0\n",
      "area_km2              0\n",
      "dtype: int64\n"
     ]
    }
   ],
   "source": [
    "print('合并后形状:', merged.shape)\n",
    "print('列名:', merged.columns.tolist())\n",
    "print('是否存在缺失:')\n",
    "print(merged.isnull().sum())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "177f2a03",
   "metadata": {},
   "source": [
    "### 小练习 1\n",
    "1. 如果想只保留在 city_info 中出现的城市，应使用哪种 how？\n",
    "2. 如果想保留全部城市 + 标记未匹配项？"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9efedc91",
   "metadata": {},
   "source": [
    "## 3. concat 纵向拼接\n",
    "场景：多文件同结构追加。例如将前 3 天与后 2 天拆分再合并。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "1fdadd75",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "是否一样: True\n"
     ]
    },
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "date",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "province",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "PM10",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "NO2",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "SO2",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "953f37f7-db27-47dd-93b1-fb63f52cc2af",
       "rows": [
        [
         "0",
         "2025-09-01 00:00:00",
         "广州",
         "广东",
         "42",
         "55",
         "19",
         "7"
        ],
        [
         "1",
         "2025-09-02 00:00:00",
         "广州",
         "广东",
         "41",
         "54",
         "18",
         "7"
        ],
        [
         "2",
         "2025-09-03 00:00:00",
         "广州",
         "广东",
         "39",
         "50",
         "17",
         "6"
        ],
        [
         "5",
         "2025-09-01 00:00:00",
         "北京",
         "北京",
         "60",
         "80",
         "35",
         "12"
        ],
        [
         "6",
         "2025-09-02 00:00:00",
         "北京",
         "北京",
         "65",
         "85",
         "37",
         "13"
        ]
       ],
       "shape": {
        "columns": 7,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>city</th>\n",
       "      <th>province</th>\n",
       "      <th>PM25</th>\n",
       "      <th>PM10</th>\n",
       "      <th>NO2</th>\n",
       "      <th>SO2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2025-09-01</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>42</td>\n",
       "      <td>55</td>\n",
       "      <td>19</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2025-09-02</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>41</td>\n",
       "      <td>54</td>\n",
       "      <td>18</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2025-09-03</td>\n",
       "      <td>广州</td>\n",
       "      <td>广东</td>\n",
       "      <td>39</td>\n",
       "      <td>50</td>\n",
       "      <td>17</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2025-09-01</td>\n",
       "      <td>北京</td>\n",
       "      <td>北京</td>\n",
       "      <td>60</td>\n",
       "      <td>80</td>\n",
       "      <td>35</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2025-09-02</td>\n",
       "      <td>北京</td>\n",
       "      <td>北京</td>\n",
       "      <td>65</td>\n",
       "      <td>85</td>\n",
       "      <td>37</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        date city province  PM25  PM10  NO2  SO2\n",
       "0 2025-09-01   广州       广东    42    55   19    7\n",
       "1 2025-09-02   广州       广东    41    54   18    7\n",
       "2 2025-09-03   广州       广东    39    50   17    6\n",
       "5 2025-09-01   北京       北京    60    80   35   12\n",
       "6 2025-09-02   北京       北京    65    85   37   13"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "part1 = air[air['date'] <= air['date'].sort_values().unique()[2]]\n",
    "part2 = air[air['date'] > air['date'].sort_values().unique()[2]]\n",
    "recombined = pd.concat([part1, part2], axis=0)\n",
    "print('是否一样:', recombined.shape == air.shape)\n",
    "recombined.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49c8d8fa",
   "metadata": {},
   "source": [
    "### 小练习 2\n",
    "1. 模拟一个缺少某列的 DataFrame 再 concat，观察列对齐。\n",
    "2. axis=1 会发生什么？"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a296f7c1",
   "metadata": {},
   "source": [
    "## 4. groupby 分组聚合与多指标 agg\n",
    "示例：按城市统计 PM2.5 平均值 & 最大值；按区域统计 PM2.5/PM10 平均。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "ef3c1280",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "mean",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "max",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "min",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "e53115bb-24a6-490f-90a3-4f12a8bf2a18",
       "rows": [
        [
         "北京",
         "62.2",
         "66",
         "58"
        ],
        [
         "成都",
         "59.2",
         "62",
         "57"
        ],
        [
         "重庆",
         "55.4",
         "57",
         "54"
        ],
        [
         "上海",
         "45.0",
         "47",
         "43"
        ],
        [
         "广州",
         "42.2",
         "45",
         "39"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>min</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>city</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>北京</th>\n",
       "      <td>62.2</td>\n",
       "      <td>66</td>\n",
       "      <td>58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>成都</th>\n",
       "      <td>59.2</td>\n",
       "      <td>62</td>\n",
       "      <td>57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>重庆</th>\n",
       "      <td>55.4</td>\n",
       "      <td>57</td>\n",
       "      <td>54</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>上海</th>\n",
       "      <td>45.0</td>\n",
       "      <td>47</td>\n",
       "      <td>43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>广州</th>\n",
       "      <td>42.2</td>\n",
       "      <td>45</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      mean  max  min\n",
       "city                \n",
       "北京    62.2   66   58\n",
       "成都    59.2   62   57\n",
       "重庆    55.4   57   54\n",
       "上海    45.0   47   43\n",
       "广州    42.2   45   39"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 每城市 PM2.5 平均与最大\n",
    "city_pm25_stats = merged.groupby('city')['PM25'].agg(['mean','max','min']).sort_values('mean', ascending=False)\n",
    "city_pm25_stats.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "9f8cdb5a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "PM10",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "6cdc163b-a1b3-4c68-9ffd-e61fad3098a9",
       "rows": [
        [
         "华北",
         "62.2",
         "82.2"
        ],
        [
         "西南",
         "57.3",
         "74.3"
        ],
        [
         "华南",
         "42.2",
         "54.4"
        ],
        [
         "华东",
         "42.0",
         "56.0"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 4
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PM25</th>\n",
       "      <th>PM10</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>region</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>华北</th>\n",
       "      <td>62.2</td>\n",
       "      <td>82.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西南</th>\n",
       "      <td>57.3</td>\n",
       "      <td>74.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华南</th>\n",
       "      <td>42.2</td>\n",
       "      <td>54.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华东</th>\n",
       "      <td>42.0</td>\n",
       "      <td>56.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        PM25  PM10\n",
       "region            \n",
       "华北      62.2  82.2\n",
       "西南      57.3  74.3\n",
       "华南      42.2  54.4\n",
       "华东      42.0  56.0"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 按区域 region 计算 PM2.5 & PM10 平均\n",
    "region_stats = merged.groupby('region')[['PM25','PM10']].mean().sort_values('PM25', ascending=False)\n",
    "region_stats"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c95cc47c",
   "metadata": {},
   "source": [
    "### 多列不同聚合写法\n",
    "`agg({'PM25':['mean','max'], 'PM10':'mean'})`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "3fa89e32",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "('PM25', 'mean')",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "('PM25', 'max')",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "('PM10', 'mean')",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "cef348f6-d2d8-4ab8-8e98-09a6e16aaa45",
       "rows": [
        [
         "华东",
         "42.0",
         "47",
         "56.0"
        ],
        [
         "华北",
         "62.2",
         "66",
         "82.2"
        ],
        [
         "华南",
         "42.2",
         "45",
         "54.4"
        ],
        [
         "西南",
         "57.3",
         "62",
         "74.3"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 4
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"2\" halign=\"left\">PM25</th>\n",
       "      <th>PM10</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>mean</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>region</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>华东</th>\n",
       "      <td>42.0</td>\n",
       "      <td>47</td>\n",
       "      <td>56.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华北</th>\n",
       "      <td>62.2</td>\n",
       "      <td>66</td>\n",
       "      <td>82.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华南</th>\n",
       "      <td>42.2</td>\n",
       "      <td>45</td>\n",
       "      <td>54.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西南</th>\n",
       "      <td>57.3</td>\n",
       "      <td>62</td>\n",
       "      <td>74.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        PM25      PM10\n",
       "        mean max  mean\n",
       "region                \n",
       "华东      42.0  47  56.0\n",
       "华北      62.2  66  82.2\n",
       "华南      42.2  45  54.4\n",
       "西南      57.3  62  74.3"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "multi_agg = merged.groupby('region').agg({'PM25':['mean','max'], 'PM10':'mean'})\n",
    "multi_agg"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b306fdc5",
   "metadata": {},
   "source": [
    "### 小练习 3\n",
    "1. 计算各 city 的 NO2 平均与 SO2 平均。\n",
    "2. 找出平均 PM25 最低的 3 个城市。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a5035fc",
   "metadata": {},
   "source": [
    "## 5. 新增衍生指标 (人均 & 比值)\n",
    "示例：人均 PM2.5 = PM25 / population_million (注意只是示例，并非真实科学算法)。\n",
    "还可构造：PM10 与 PM25 之差 diff = PM10 - PM25。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "7ae91ac0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "date",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "city",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "population_million",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "PM25_per_million",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "PM_diff",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "159c4216-f24d-4405-b060-9b4e67e2d11b",
       "rows": [
        [
         "0",
         "2025-09-01 00:00:00",
         "广州",
         "42",
         "18.7",
         "2.2459893048128343",
         "13"
        ],
        [
         "1",
         "2025-09-02 00:00:00",
         "广州",
         "41",
         "18.7",
         "2.192513368983957",
         "13"
        ],
        [
         "2",
         "2025-09-03 00:00:00",
         "广州",
         "39",
         "18.7",
         "2.085561497326203",
         "11"
        ],
        [
         "3",
         "2025-09-04 00:00:00",
         "广州",
         "45",
         "18.7",
         "2.4064171122994655",
         "12"
        ],
        [
         "4",
         "2025-09-05 00:00:00",
         "广州",
         "44",
         "18.7",
         "2.3529411764705883",
         "12"
        ]
       ],
       "shape": {
        "columns": 6,
        "rows": 5
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>city</th>\n",
       "      <th>PM25</th>\n",
       "      <th>population_million</th>\n",
       "      <th>PM25_per_million</th>\n",
       "      <th>PM_diff</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2025-09-01</td>\n",
       "      <td>广州</td>\n",
       "      <td>42</td>\n",
       "      <td>18.7</td>\n",
       "      <td>2.245989</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2025-09-02</td>\n",
       "      <td>广州</td>\n",
       "      <td>41</td>\n",
       "      <td>18.7</td>\n",
       "      <td>2.192513</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2025-09-03</td>\n",
       "      <td>广州</td>\n",
       "      <td>39</td>\n",
       "      <td>18.7</td>\n",
       "      <td>2.085561</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2025-09-04</td>\n",
       "      <td>广州</td>\n",
       "      <td>45</td>\n",
       "      <td>18.7</td>\n",
       "      <td>2.406417</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2025-09-05</td>\n",
       "      <td>广州</td>\n",
       "      <td>44</td>\n",
       "      <td>18.7</td>\n",
       "      <td>2.352941</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        date city  PM25  population_million  PM25_per_million  PM_diff\n",
       "0 2025-09-01   广州    42                18.7          2.245989       13\n",
       "1 2025-09-02   广州    41                18.7          2.192513       13\n",
       "2 2025-09-03   广州    39                18.7          2.085561       11\n",
       "3 2025-09-04   广州    45                18.7          2.406417       12\n",
       "4 2025-09-05   广州    44                18.7          2.352941       12"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merged['PM25_per_million'] = merged['PM25'] / merged['population_million']\n",
    "merged['PM_diff'] = merged['PM10'] - merged['PM25']\n",
    "merged[['date','city','PM25','population_million','PM25_per_million','PM_diff']].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "97db18c8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25_per_million",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "e7480f21-0b56-4461-81bf-987cff64f00e",
       "rows": [
        [
         "华北",
         "2.8401826484018264"
        ],
        [
         "华东",
         "2.46"
        ],
        [
         "西南",
         "2.2724521584334667"
        ],
        [
         "华南",
         "2.2566844919786098"
        ]
       ],
       "shape": {
        "columns": 1,
        "rows": 4
       }
      },
      "text/plain": [
       "region\n",
       "华北    2.840183\n",
       "华东    2.460000\n",
       "西南    2.272452\n",
       "华南    2.256684\n",
       "Name: PM25_per_million, dtype: float64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 区域层面的人均 PM2.5 平均 (再聚合)\n",
    "region_per_capita = merged.groupby('region')['PM25_per_million'].mean().sort_values(ascending=False)\n",
    "region_per_capita"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c777761",
   "metadata": {},
   "source": [
    "### 小练习 4\n",
    "1. 新增列 PM_ratio = PM10 / PM25。\n",
    "2. 计算各区域 PM_ratio 平均，排序。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7affef1",
   "metadata": {},
   "source": [
    "## 6. Mini Project 任务说明\n",
    "30~45 分钟内完成：\n",
    "1. 选 2~3 个问题 (例：哪个区域人均 PM2.5 指标最高？城市 PM10-PM25 差值分布怎样？)\n",
    "2. 数据流程：读取 → 合并 → 衍生列 → 分组统计/排序 → 1 张图 (柱状/折线/散点选其一)\n",
    "3. 输出：\n",
    "   - 关键表 1~2 个 (head 或整表简洁显示)\n",
    "   - 图 1 张\n",
    "   - 结论 80~120 字（含客观描述 + 解释 + 局限）\n",
    "4. 伦理：注明数据来源 (课程提供示例)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6021162a",
   "metadata": {},
   "source": [
    "## 7. 项目示例骨架 (可直接复制修改)\n",
    "下面示例：问题：哪个区域的平均 PM25 与人均 PM25 指标都偏高？"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "271df4c3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "region",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "PM25",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "PM25_per_million",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "PM_diff",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "ba9cb121-4d65-4c3d-9011-ecf81f2f69fb",
       "rows": [
        [
         "华北",
         "62.2",
         "2.8401826484018264",
         "20.0"
        ],
        [
         "西南",
         "57.3",
         "2.2724521584334667",
         "17.0"
        ],
        [
         "华南",
         "42.2",
         "2.2566844919786098",
         "12.2"
        ],
        [
         "华东",
         "42.0",
         "2.46",
         "14.0"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 4
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PM25</th>\n",
       "      <th>PM25_per_million</th>\n",
       "      <th>PM_diff</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>region</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>华北</th>\n",
       "      <td>62.2</td>\n",
       "      <td>2.840183</td>\n",
       "      <td>20.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西南</th>\n",
       "      <td>57.3</td>\n",
       "      <td>2.272452</td>\n",
       "      <td>17.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华南</th>\n",
       "      <td>42.2</td>\n",
       "      <td>2.256684</td>\n",
       "      <td>12.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>华东</th>\n",
       "      <td>42.0</td>\n",
       "      <td>2.460000</td>\n",
       "      <td>14.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        PM25  PM25_per_million  PM_diff\n",
       "region                                 \n",
       "华北      62.2          2.840183     20.0\n",
       "西南      57.3          2.272452     17.0\n",
       "华南      42.2          2.256684     12.2\n",
       "华东      42.0          2.460000     14.0"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# (1) 基础汇总\n",
    "region_summary = merged.groupby('region').agg({\n",
    "    'PM25':'mean',\n",
    "    'PM25_per_million':'mean',\n",
    "    'PM_diff':'mean'\n",
    "}).sort_values('PM25', ascending=False)\n",
    "region_summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "1a094c30",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAk4AAAGGCAYAAACNCg6xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAABRYElEQVR4nO3dd1gU1/s28HsBBUQRARFQLIlGBdGgsSsgRmONLXYldk2sUaOixoANY4klGqJfe+xdVDRWsGFBAXsjGlH60ttSdt4/eJmf61IGBFnk/lzXXjJ1n13G3ZszZ87IBEEQQERERET50irpAoiIiIhKCwYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJ6CNITU0t6RI+uqCgoJIuQU1aWhoyMzMlr3/8+HFER0eL0yEhISrT+cnIyChQfYWRmpqKtLS0Yn8eTVOQ3yNRUWJwojLP29sbUgbQDw8Px7fffovY2Fi1ZWvXrsWqVaty3faHH37A77//jpcvX35IqaVK+/btcfXq1SLZV0REBAYNGoTHjx9L3qZVq1YICAhQmXfmzBl8/fXXkgPN0qVLcf/+fXHax8cH/fv3VzleoqKicg1TP/zwA5YvX57v8/j4+ODChQuSanrf06dPYW1tjfj4eLVlgiAgJiYmx+2GDh2q9v4UlUuXLhV4m/j4eLi5uUn6vwgAJ0+exJQpU/Jd/9GjR7m+B0SFoVPSBRCVtE2bNmHWrFmoUKFCnuspFAqkpKRg165dmDRpksqy+Ph4NGzYMNdtTU1N0bRpU7x9+xarVq3C+vXri6R2TZWQkIB69eqhXbt2ua4TFhaGbdu2wcXFJd/9xcfH4/bt23m+x+9LSUnBl19+qTLv6tWrmDJlCnR0pH306erqomXLluK0vr4+OnXqBJlMBkEQcODAAcyYMQOtWrXCvn371Parq6uLFi1aiNMXLlzA+PHjUbt2bZX13r59C3Nzc3Ts2DHXWoKCgvD555/nWKOdnR0MDQ3VlmVkZKBWrVoICAjAZ599prLs/v37sLa2zvP1F5RSqcTSpUvRr18/lflPnz7FsGHDYGBgkOu2b968QVBQEExMTNT+f6WkpCAzMxMVK1YU5/Xq1Qtr1qxBSkqK+H939+7dOH/+PLZs2QItrax2gQYNGmDFihXo0aMHbGxsiuqlUhnG4ERlXvny5bFz5040aNCg0PuQyWTQ09PLdbmxsTHS0tLg6OiImJgYzJ49GwsWLMjzi6Q0u3nzJp4/fw5HR8dc1/nvv//w6tUr1K5dG4MHD85zfwqFAq9fv0ZaWhrKly8vqQZ9fX21eY8ePcKyZctU5sXGxuLo0aMYOXKk2vrZX76XL1+GpaUlgKxTRHv37sXu3bvRokUL3L59GxYWFvD394ednZ3K9jo6OtDT00NERASqVq2Kjh074sWLF/nW/ttvv+Ho0aMqx9Tdu3exYsUKjB8/Xu05dHV1c9xPuXLl0LBhQ7XQBAAVKlSQ/F5KNXfuXAwfPlwt4NavXx8XL15EpUqVCrXf+Ph49OzZExUqVICPjw9kMhns7e0BAN26dUNMTAyMjIwgk8kAAIGBgeLvQktLC7NmzcJPP/2E6dOno2bNmh/wCokYnIjyFRUVhUmTJmHVqlWoXr16nuuGh4dj2bJluHnzpsqXUnh4OHbv3g1TU1OkpaUhPj4erVq1Qp8+ffJ9/hMnTmDo0KEYPHgwmjdvjsTERPz777+oVq0a5s6di5MnT2LYsGH4+++/8e2334rb1axZEwMGDICbmxsMDAywf/9+hISEQBAEREdHo0KFCvj5559Rrlw5AMD06dNx8eJFleceM2aM2l//Uly/fh0bN25Ez549C7xtTiIjI9GwYUM8e/YMjRo1KtQ+/Pz88O+//8LJyUllfmZmJm7evIlWrVqhYcOGWLBgAS5dugRtbW0EBASgc+fOuHv3Llq3bo2JEyfi+vXraNeuHTw9PcVgBQBXrlyBnZ0dnj17hn379sHQ0BD+/v4AgKNHj2L8+PGYO3eupFonTpyI4cOHi61IGRkZaNasGbp3746kpKQcA7dCocDkyZOxaNEiVKtWTZyfHSbyEx4ejq1bt2LOnDmSt3nXP//8gypVquTaqlPY0AQA1apVw61btwAASUlJaNiwIS5duiTWOWzYMPzwww9o27YtAGDVqlWwsLCAubk5gKz34Ndff8X48eNx4MCBQtdBBAAQiMq477//Xnj8+LEgCILg7e0tKJVKleW//PKLYGdnJ0RERKht6+npKWzevFn47rvvhMGDBwt16tQRgoKCxH1ERkYKSqVSOHnypLBjx45C19i+fXshNDRUZd6iRYuEv/76SxAEQZg8ebIwceJEcdnjx4+FmjVrCiEhIYIgCMIff/whnDhxQmX7Xbt2CTNnzhSnz507V+j63tepUychLi6uyPa3fft2YcGCBcLWrVvzXC8hIUHIyMgQBEEQWrZsKaSlpQnjx48XgoODhUmTJglyuVxtm6SkJMHJyUllH8nJyYIgCIKDg4OQkpIiLjt69Kjg7u6e43OPGDFCEARBUCqVwsuXL4X4+HihXbt2wsWLF4WEhAQhNDRUSE1NzbX2devWCcePH89x2YMHD4SBAwcK3t7eQrNmzYSOHTuKjzZt2gjm5ubi9KFDh1S2bdmyZY77fHd+RkaG0LVrV2HWrFlCdHR0rjXmxcnJSQgPDy/Utu/y8fERDh8+LE7PmzdPaN++veDg4CA+rKysxJ9btGghWFtbi9OtW7cWAAjt27dX2/eYMWOE27dvf3CNVLaxxYnoHXXr1sUvv/yC8ePHw8rKCrGxsfj7779x4cIFVK1aVW397BaV0NBQfPnll9izZw8ePnwIa2trVKtWDc+fP8esWbNgZ2eH58+fAwC2bduGW7du4c8//5T8l/27LRvZ6tSpI572cXJywvz588VlL1++RJ06dcT9nzx5EiNGjFDZfsiQIViyZAlWrFghqQapUlJSACDHPjfvunLlCmrWrIlatWrlu8///vsP7dq1w40bN/Jc78CBA9i4cSP09fXx6NEjdOrUCQCwc+dOWFtbQyaTqZ1SS01NVTmt924/mvfp6OggMjJSbX5wcDAOHz6M1atXw8jISOzD9ObNG+jr66NixYrYvXs3tm3bJu4/Li4OERERqFevnrifhg0biq2GW7duxbBhw1C+fHncvn0b7dq1g4ODA/z8/FSe+8WLF3B1dcWuXbvyfG8A4PDhwzhy5Ah27twpzsvMzMSECRPQvXt3TJw4Md995CQiIgIymQxmZmYq8319fTF16tR8+w++KyAgAElJSXj69Ck+++wzzJ8/H9ra2ihXrhzu3buHxo0b57jdw4cPER8fj9atW+e67379+uHgwYP46quvJNdD9D4GJ6J3VK9eHQsWLMDPP/+Mb775BufPn4ebm1uOfURyo62tjZYtW2L79u2YM2cOmjRpAkNDQ0RFRQHI+jLp3bt3oU6HZHv8+DE2b96Mbdu2Acj6sjcxMcF///2XYxDZuXOn2qmSV69e5djZ+ENt374dQUFBefZvArLCUPXq1SVdeRcYGIgJEybgn3/+yXO9UaNGYdSoUQCyrqrz9vYGkNVnaNy4cdDW1sa6detw+vRp8bSZQqHIs39aRkYGDh48iGrVqqFFixb47bff1Dqdy2QyDB8+HJUrVxbnvXz5EkZGRjhx4gSaNm2K8ePHq/RPunr1Ko4dO4aVK1fm+Ly9evXC7NmzsWLFCvj4+GDGjBlivbn1acrJ06dP1X4X2eHrzZs3mDNnDoYNG4YuXbpI3uf7/P39UbduXbX5rVu3xrlz51Tel2zbt29HamoqJkyYkOe+3/3dDBw4UOU05Lvkcjn09PTg6+uba+f/L774Itf3m0gqBiei95QvXx5r1qzB2LFjcffuXfz+++8fvM+qVasiPDwcQNZf1B4eHgXex/79+2FgYIDk5GSUL18eR44cQZUqVcTlnTp1wtmzZzFy5Ei1Tr/vtwSEhIRg/vz52LRpkzgvMTERa9euRVBQEIKDg9G0aVPMmDGjQK0FADBhwgQ4Ozvn2/F9zZo1uX4JvksQBMjlcpiZmaFu3boIDAxEkyZNJNejUCgwYMAAaGtrAwBcXFxQq1YtTJw4EZUrV0ZqamqOnaSjo6MRFRWF3377DaNGjRJb+K5cuaK27vHjx9GzZ0+VMJzd58fKygodO3bE8ePHYWxsDC8vL3zzzTf51m1iYoKRI0di4sSJePXqldi3y9fXF2/fvsXQoUMlvf769euLAfJdqampOHr0KP788898WwfzExUVBQsLixyXvRuabt26hebNmxf6jwYDA4Ncg4+vry/kcnmeV0xaWloiIiKiUM9NlI3BiSgHycnJSE1NRdu2bTFu3Dhs3Ljxg1qIqlWrhrdv3+LmzZto06aN+CVeEAMHDhQ7u+akc+fOWLFiBRo1aoRmzZrlut7Jkyfh5+eHTZs2qYSbJk2aoFu3bihfvjwyMzOxevVq9O3bF2fOnClQnTKZTNLVguHh4WpXoeXk9u3b4iX9PXr0wIoVK7B27VrJ9ejq6qJOnTridPny5bF7927xC12hUKgEpxs3bsDDwwMJCQkQBAHz5s0TWz3c3d3x8OFDlVaQ9PR0BAQE4Pz58yqniZ49e4bGjRvjyy+/RHBwMK5du4aOHTti9OjROHHihLieIAjYsmULxowZo1Z748aNYWlpKZ7+BABHR0d89913MDMzE09FFoaenh4mT55c6O3fVaVKFSQnJ+e5TmJiIvr27Yvt27fj66+/Vll2+fJlCIIABweHfJ8rtzG4pAyImZKSAiMjo3zXI8oLB8AkyoGbmxuWLVuGdevWISYmBtOmTZO0nUKhQGJiIs6cOQNHR0fs2bMHSqUSOjo6SEhIwLZt2zB69Ohiqfmrr76Cv78/5HJ5jl8O2WHI0NAQrq6uauGmTp06YoDQ1tbGzJkzkZiYKPbNyk96ejoSEhJyXb5v3z64u7uL02FhYWotYTn53//+J57OqVGjBkJCQhAaGiqpJiBrbKF3W4lSUlIQExODRYsW4cyZM1AoFCqtar6+vvj8889x5MgRtX5t+vr6WL9+Pby9vcXHtWvXYGtrqxKaHjx4IF7hBQCurq7o2bMn9u3bhylTpqj0sZHJZKhcuTK2b9+eY/03btzAw4cPVV7D999/n+eAqx9b7dq1ERISkuc6ixcvxoIFC9RCEwDY29sjKChIbaiI9yUlJWHOnDk5PrZu3aoSMHMSFhbG4Qjog7HFieg9f/zxB4YMGYIaNWoAyOqka2Njg549e+b4oQ9kDfj4v//9Dy9evEDr1q0xZMgQ/P777zh79qw4PlT16tUREhICW1vbYqlbS0sLdnZ2OY4TlD0w4YQJE3Ls5P727VukpKSo9VOpXr265FGX09LS0KFDBxgYGOTaOieTyTBq1ChUq1YN4eHh+Z6qe/ToEdLT01X6mE2ePBnjxo1TabXJSVJSEn755Rdoa2vj66+/hoeHB86ePQstLS2MGTMGv/zyC4CsoPRuh/CffvpJ0uvNy9mzZzF9+nQx7JQrVw5KpRKenp45Xg7fv39/tG/fHq1atVIZT+zUqVNwdHTEV199hbFjx+LBgwfQ0dFB69at8w0JhZXbcAd5adiwIYKDgyEIQo6/+9u3byMpKQnjxo3LdR+jRo3CsmXLMGHCBHh4eOS4n379+mHs2LGoVasWgoODoaWlJQ4Rcv/+fbx+/TrPOr29vdGtW7cCvTai97HFiegde/bsQfPmzVU6/1aqVEkcm+l969atQ8OGDZGcnIy//voL06ZNQ4sWLfD7779DoVBAR0cHJ0+eRHJyMpRKpUpLw/HjxyXXJQgClEplrsuzT1N888034mk6QRDE21Fs2bIFw4cPVwlNsbGx4imvK1euqI1m/vr1a7x69UrS6TQgq/+Jn58ffHx8VFpk3n1cunRJDEuxsbEwNjbOdX+JiYmYMWOGWp8We3t7VKpUKcdbmTx48ADTp09HixYt8M0332DcuHFwdXVF3bp1cebMGSxduhSHDx9G165dxW1iYmLyrON948ePh6Ojo8ojLi5OXB4aGppjwN6zZw8mTZqU66CT74/zlJKSgqNHj2LmzJno2LEj9PT0xI70pqamGDBggOSa35WQkKBS7/tOnjxZ4H3KZDL06NEDPj4+asvi4uLg4eEhqa/gnDlzEBkZCTc3N7Vlfn5+WLx4MXr06AEvLy9YWFhg7Nix4q1qYmJi0Lx5c8yfPz/HKx+BrJHb+/btW8BXR6SKLU5E72jQoAGaNm2qNn/w4MFISkpSm9+tWzc0btwYjo6OUCqV2LdvH/z9/fHkyROYmZlhwIABGDt2LLZt24YOHTrg0KFDiI6Oho6ODuRyuaSaTp8+jYCAAPz+++/iYJbvL1+6dCksLS3FfjLZ26xduxa//PILTp06hcOHD4vbpKSk4M6dO9i4cSMAYNCgQdDX18fvv/8OLS0tpKSkICIiAocOHRIHyCxqefUZi42Nxffffw83NzeYmpqqLffw8EDr1q1hYGCgcgm9sbExMjIycP78eZUOz+bm5rkG1devX+fasfl9SqUSGzduVLucvVWrVuLPFhYW4v4yMjIgCAIiIiKQmJiYa4slkNV/a9asWeL03LlzsWDBArE/3PDhwyXXmZPLly/D09MT9erVUxua4l2FvZ/ilClTMGHCBLRu3Vq86i8lJUXskyb1OFq/fj1sbGzg6uoqzktOTsaVK1fw1VdfwcDAQGw1mjZtGs6fP4+OHTvC29sb9vb2GDFiBDp06IDTp0/DyspK3MfBgwfRv3//PK+gJJKkxEaQItIQ7w6AmZ+NGzcK06ZNy3W5i4uL4OHhISQlJYnz9u7dKw5UGRgYKHTv3l1YtWqV2kCFZYmvr68wZMiQHJfduHFDGDJkiPDixYs89/H69WuhZs2agrOzsxAVFVXoWvr16ycEBgaK02lpaeIApm3atFEZtHLs2LFC06ZNVQZjdHBwEOrVq5fjvseOHStcuXJFuHz5strAql5eXsKMGTNU5nl7ewuCkHWc+fr65ll3cnKykJiYKNy5c0ccfPN9GRkZQvXq1QV7e3th8eLFKq/Fzs5OZXBPQRCE0NBQoUGDBmq1SvXmzRvB1dVV3P7Jkycqz5Geni7I5XIhPT1d2Lhxo/j/4n2enp4qr2HAgAHi+9GqVStBoVAIgiAImZmZQkJCgpCZmSk0bNhQiImJEQQha9DMd/ft7+8vrFu3rlCvieh9bHGiMu/Zs2fo169fjn1/cpPbDVeXLl2qMu3h4YFq1aqJ4/c0btwYS5YsQb9+/TB16tQPK7yU8fHxwdWrV3H+/Hk8f/5crY9SdHQ0/vzzT1haWuLvv//OcdDPd1lZWeHq1av45ptvMGrUKGzbtk3SKbeYmBgcPXoUfn5+ePLkCSpXrqwyqOLz589x9OhRbNu2DZUrV1YZMyk1NRWbN29WO32ZUysl8H9XZ77b0vTw4UP89ddf2LdvH2bOnKmyfvZVZUOGDMlzIE4ga0iJPXv2wNXVFVu3bs1xncOHD6Ny5cr4888/1W6F0qdPH7XxnbS0tDBo0KBCX0FavXp1TJgwAVu3bsXo0aNRv359leUpKSk4ffo0tm/fDl9fX1y/fj3H/bx7qx4vLy/UqlVLbNVr1qwZrK2tVd6fxMRElb51P/zwg9hS9/jxYwQFBRXZFYREMkH4/50giMqoK1euoFWrVkV+SiohIQH//fdfjvdWUygUePr0aa6jIH/K/P39YWJionZ1U0JCAvT09Ar8e8jIyMhz7J7cZPch6t+/f47PGRUVBQMDA5VRxUNDQ3M8XRYfH5/jWEjHjx9Hp06dchwLa8OGDRgxYsQH3+j5yJEj6NOnT45hJz09HRkZGTne8Lg4Cbl0En9XZGRkgf5YKSylUplvCCcqCAYnIiIiIokYw4mIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCQqU+M4KZVKhISEoFKlSh90p3siIiL6dAiCgISEBFhaWuY7fEWZCk4hISEqQ/ATERERZQsODhZv8J6bMhWcKlWqBCDrjclpsDoiIiIqe+Lj42FlZSXmhLyUqeCUfXrO0NCQwYmIiIhUSOnGw87hRERERBIxOBERERFJxOBEREREJFGZ6uNERETqMjMzkZ6eXtJlEBWbcuXKQVtbu0j2xeBERFRGCYKAsLAwxMbGlnQpRMXOyMgI5ubmHzyOI4MTEVEZlR2azMzMUKFCBQ4MTJ8kQRCQnJyMiIgIAICFhcUH7Y/BiYioDMrMzBRDk4mJSUmXQ1Ss9PX1AQAREREwMzP7oNN27BxORFQGZfdpqlChQglXQvRxZB/rH9qfj8GJiKgM4+k5KiuK6lhncCIiIiqE5ORkCIKQ5zrr1q2Dr6+v2vwtW7bg/v37KvNGjRqFhISEIq2Rih77OBERERXCpk2bcOjQIejo/N9X6ePHj9GwYUOV9Z4+fYrWrVurzLt+/Tp69uwpTj9//hzlypWTdK80KlkMTkREVCo8fPgQPXr0wJAhQ2BjY4OQkBCEhITA3d0d8+fPh6+vL65evQoAiIyMxBdffIFTp06hTZs28PT0RGRkJNLT0/Hvv/9i/Pjx+PzzzwEAffr0Uen3MnXqVHTq1CnfeqZNm4Zhw4bB1NRUnDds2DDs2rULABAXF4egoCA0bdpUbVuZTKbSv+zAgQOwt7eHn5+fOM/LywtDhgxB3bp1C/hOUXFicCpKewaWdAXFa8j+kq6AiIrZ6O23P+rzbRnRXPK6NjY2sLe3x08//SSGlfPnz2PNmjX48ccf4ePjg8TERFSsWBFPnjxBmzZt0KZNG/zzzz9IS0vD6NGjAQAJCQlwdnbG0aNHAQAzZsxAu3btClV/XFwcDh8+jPHjx6st27FjB6KionIMTtkuXrwIBwcHhIeHo2PHjsjIyBCXPXv2DOHh4QxOGoZ9nIiIqNR4v4OvpaUl5HI5ZDIZ2rZti2vXrgHIGrsne923b9+qBJJKlSpBW1sbCoXig+v5/PPPERUVhQMHDqjMT09Px7Nnz7Bw4UKV+SNGjICDgwNOnTqFr7/+Gr169cLy5csxZcoU2NjYoFWrVuLD2NhY7bQflTwGJyIiKpUUCgVWr14NZ2dnAICjoyO8vb0BQKXT9qhRozBo0CBx+u3bt6hSpQp0dXUBZJ0CXL58OaZOnYoVK1aohCwpBg8eDB8fH5V527Ztw5w5c9TW3bp1K3x8fNC9e3ecP38eYWFhqFmzJurWrYujR4/ir7/+EtdNTU2Fnp5egWqh4sfgREREpcrp06exb98+7NixAy4uLmjUqBEAwM7ODgEBAQgPD0e1atVy3PbRo0dYv3491q1bJ8779ttvMWvWLKxduxaGhoZYunRpvjWkpqZCqVQCAD777DMsX75cZXmHDh1Qo0YNAMDq1atx584dAICWlurXbmZmJgYPHgwAcHZ2xp49e5CcnCw+R3a4I83BPk5ERFSqdO3aVaVDdjYtLS3o6+vDz88P3bt3V1t+6tQpxMXFwd3dXWX+u7fgGD9+PLp165ZvDd7e3nBzc1MLNo8fP4ajo6Pa+i9fvkSzZs3U5hsaGqpM7927F7q6uoiOjkZ6enqR3ZiWig6DExERfTLatm2LGzduqAWngwcP4rPPPlOb/+zZM9SoUUPlCjcpYaVLly7o0qWL2vx3r6rLT1xcHE6ePIn69evD1tYW165dw+XLl2FhYYHvvvtOZZgD0hw8VUdERKWGIAg5DjqZPd/R0RGWlpYAIJ5Ke/ToESpWrKjS4nP69GkAWS08t2//35WEb968gbGxcbHVL5fLsWvXLgQFBWHHjh3o3Lkz/P390aZNG7x8+RI///wzxowZAyMjI47qrqEYZ4mIqFR4/Pgxrly5gk2bNmHevHkqyzZu3AgdHR0sWrQItra22LlzJ65fv46bN2/i/PnzePbsGfbvzxpS5d9//4WNjQ26du0KFxcX7Ny5E//++y/S0tLw4sULrFy5stA15jeS+JYtW6ClpYXTp0+LHb+HDBmCAQMGoGLFiirrZgc/0iwMTkREVCo0bNgQ//77b47Lli1bJv5cvnx5ODs7i1fbtWzZMtd9li9fHmPGjCmyGlNTU/NcPmvWLLV57145FxMTgypVquDGjRuoWrVqkdVFRYfBiYiIRAUZkJLUxcXFFXpbQRBw//597Nq1C48fPxYH6CTNwuBERERURL7//vtCbyuTyWBvbw97e/sirIiKGjuHExERFZHhw4eXdAlUzBiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiKiUu3q1asIDAzMd71Lly7B09NTZZ6bmxsePHhQXKWViKSkJJw/f76ky8jVnTt3EBQUlO96u3btQmhoaK7Lo6KicPXq1aIsTRIGJyIiKtWOHDmChw8f5rvemzdvVEb2FgQBV69ehbW1dXGW91ElJSVh3bp1cHJyKrEaFAoFHBwc4OjomOOjc+fO6NWrl9p2ycnJKresadKkCTZt2iRO79y5E//88484bWpqCkNDQ+zdu7d4X9B7OAAmERGVamlpaYiPj893PZlMhgoVKojT169fh62tLe7evSvOCwwMhLm5Obp3757nvk6ePIkhQ4ZgxYoV0NfXx4sXL2Bqaophw4Zh2LBhaNKkCdzd3QEAhw4dwrJly3DkyBFYWFhg586dkMlkSE5ORlhYGObOnYsKFSrg1atXGDNmjHgLFplMhk2bNsHCwkLye7FkyRLMnTsXWlr5t4soFAqsWbMG+/fvx/z589G3b1+V5Rs3bkRqairkcjmGDRuGL774QtIyXV1deHl5wcDAQHLdALB582YcPHgQ/v7+MDc3R40aNQBkvdflypVD+fLlYWFhgW+++UbcpnHjxggICICfnx+++uqrAj1fYTE4ERHR/9kz8OM+35D9H7yL6Oho6OrqSl7/+vXraNasGU6ePIkBAwYgIyNDXBYXF4ewsLB8g1OPHj3Qvn17jB8/Xpw3f/58PH/+HLNmzcJvv/0mzn/9+jW6d++OmjVrYu7cuZg+fTpMTU0BAAEBAViwYIF4Y+Ht27eLgaGgXr16BR0dHbWbBedGV1cXs2fPRkpKitoNhS9fvgwtLS1MnToVSqUSY8aMwdatW/Ndlq2goQkApkyZgilTpiAxMRGDBw/GiRMnAADu7u5o3rw5vv76a0RGRmLHjh0qI7QPHjwYEyZMwJYtWwr8nIXB4ERERKXeu6d43rd+/Xrs2bMHMTEx0NHRQWhoKH744QfY2dnBxsYGlSpVEtcNCQmR/JwymUxl2tLSEnK5HIaGhtDX10diYqIYYrLXlcvlSExMFIPTl19+iV9++UXyc+Zl7969aqfAnj17hrdv36JDhw4AAA8PD0yYMEGl9tq1a6vta/PmzfDw8AAAaGlpwc7ODpcvX4a9vX2ey959zXmJjIzEsmXLsHz5cmhra2Pfvn3Q1taGUqmEtrY2evfujUOHDgEAzM3NERISgl27domtXHFxcahcuTIAoFy5cpDJZCrzihODExERlVqCIEBXVxflypVDfHw8DA0N1db58ccfMWnSJOzatQtGRkbo0aMHNm3ahAEDBuDOnTvYs2cPli1bhnLlyiE1NRVGRkYFriM8PBwXLlzAzp074e/vj7Zt2+LatWuws7ODmZmZeCpx48aNKtt5eXmhS5cu4vTZs2cRGhqK//77D127dkWfPn0k1/D48WNMmjRJZd4XX3yBoKAgXL9+HXfv3kWfPn3UAl9OEhMTVVqNvvrqK1y7dk0MR7ktc3Fxwb1799Se49GjRzn2JfPz80PLli0xaNAgAFmnGs+ePau2fXh4OJo2bYrdu3fnWG+dOnXw/Pnzj3K6jsGJiIhKrewv5Pbt2+PUqVMYPHiw2jo59ffp378/AKBZs2Y4evQozp8/j65duyI1NVXyab+EhAQcO3YMycnJSElJwdatW8VA4ejoiEOHDkFPTw8NGjTAixcv1LY/ceIEXr16hcmTJwPIarHq1asXTExMkJGRgdGjR6NKlSpwdHSUVE9sbKxK61m2rl27YurUqWjfvj2qV68uaV/vnr4EABMTE4SFheW77I8//shxf4MGDcK+ffvyfd709HT89ttvqFu3rsr8a9eu5XmFXdWqVcUaihuvqiMiolLL09MTvXr1Qu3atXHjxg3J21WpUkX8eebMmXBwcEBERESBglOlSpXQu3dvDBkyBKNHj1Y5TWRnZ4eAgIAcTyEqlUp4eHjAxMREDE0AUL58eZiYmAAAdHR0sGzZMuzcuVPyazI0NERKSora/L1792LcuHFQKBS4f/++pH3p6Ki2q8jlcpibm+e77F0DBw5UO/Xp6+uLW7du5fncAQEBuHHjhsrj0aNHeW4TFxcnvnfFjS1ORERUKqWlpSEoKEi8oqtevXq4cOECOnbsmOs2GRkZ8PLygkwmQ9euXfHgwQN4eXlBV1cX3bt3R2pqaqE6Nr9PS0sL+vr6SE1NVenzIwgC1qxZA2dnZ7GfUzZ/f3/Y2dmJ06ampoiNjZX8nNbW1nj+/DkaN24sznv+/Dnq1q0LGxsb2NjYYP/+/bCxscn3qruKFSsiKSlJfC/8/PzQokWLfJdlO3jwIBo1agRLS0uV+a1bt8aOHTtw//59jB49Osfn3rp1q8rVj0BWa9qIESNyrffly5cYOXJknq+pqLDFiYiISqXly5dj4sSJ4vSYMWOwePFitbCRmpqK06dP4/jx4/jnn3/w+eefw8LCAra2tvD09MTIkSMxdepU1K1bF3FxcSqtUXl5/0q09+e3bdsWtWrVgiAIYsvTvn37MGjQIDE0KZVKnD17FgAwe/ZsldNgp06dknyaDoDKlWjZ6tWrh+bNm4vTAwcOVAtN79aXbdSoUeKpNaVSCX9/f9jb2+e7DMhqgfL29sb8+fNzrPP7779HUFAQ1q9frzI/NTUVzZs3x5IlS3DixAksXboUhw8fhre3N/bv34/69evnOOyEUqlEZmamWhAtLmxxIiKiUmfHjh0wNjZWaaHR09PD4sWL0bdvX5w4cUJsEbl58yYOHz6M9evXo1q1auL6Fy9eRNWqVVX2+/bt2xxPO73Py8sLFy9exKVLl8Qr1gAgJiYGK1asQEpKCiZNmgSZTIYpU6bgzp07GDt2LE6dOiUO4piZmQk/Pz/8+uuv4mvauHEjDAwMEBcXh/j4eMybN0/ye1KnTh3Ex8cjMjJS7XXl5ty5c9i+fTv09fVRrVo1MQA5Ojri6dOn2LBhA6KiojB79mxxm7yWZWZmYu3atVi2bFmendDd3NzQsmVL9O3bV2yVOn36NLp27YoxY8agevXqsLGxQdeuXcWO7QMHDsSMGTMwcOBAtGrVStzXpk2bMGrUKMnv04eSCXldw/mJiY+PR+XKlREXF5fjlRcf7GOPf/KxFcF4K0SkGVJTU/Hy5UvUqVNHHHARgMaP45Seno5Vq1ahTp06GDgw51p3796N3377DTt27FAJVrnJ7lQdExODkSNHqrXalCbJyclYtWoV5syZg3Llyn3059+9eze6du0KY2Njlfk5dQ4/e/YsWrdujUqVKiE8PBzz5s3D5s2bMWLECHh4eEBfX18MU+PGjcOff/6JpKQkdO7cGTdv3gSQ1R/q/v37GD58eL615XrMo2D5gC1ORERUKgQGBsLHxwcjR45UaTl639ChQ1GuXDnMmTMHGzZsULtC631v377FiRMnsH379gJ1xtZEFSpUwNSpU+Hp6Yl+/fp99OcfOnSo+PObN2/g6emJe/fu5Xg1X+fOncWfPT09xaEU6tevD3t7e7HF0N3dHXFxcUhOTkblypXx559/AgAiIiIQHBwsKTQVJbY4FSW2OBFRKZHXX99ERcnb2xvNmjXLcaiEj+mTaHG6desWHjx4gIyMDLx58wbt27dHp06dsGDBApV7Bzk5OWH69OkAsppUV65cCQsLC8TGxsLFxUXSPXmIiIjo4ytIB/fSoMSCU1RUFI4dO4alS5eK84YNGwZbW1t07twZCxcuzHE7V1dXzJ8/H6amprhz5w42bNigMg4GERERUXEpsaaa8PBwtYG6WrZsmefgXHK5HCkpKeIlh82aNcPly5eLtU4iIiKibCUWnGxsbLB69WpxOj09HTdu3EDr1q0hl8uxbt06zJ49GzNmzEBUVBSArI6Btra2KvsxNjYWlxMRUcGUoW6uVMYV1bGuEVfVxcXFYenSpVi0aBEqVqyIFi1aiHd49vf3Fy8PjYiIULvEMfseOTkNfKVQKKBQKMTpnAbOIiIqi7IvVU9OToa+vn4JV0NU/JKTkwHgg4dpKPHg9ODBA5w7dw5ubm5iL3cLCwtxuZ2dHSwtLfH69WuYmZmp3a8mt3vkAFmXMLq5uRVf8UREpZS2tjaMjIwQEREBIOsy9rwGLCQqrQRBQHJyMiIiImBkZARtbe0P2l+JBqc7d+7g5cuX+Omnn8R50dHRSExMRM2aNcV51apVg1wuR5MmTbB/v+ol8dHR0bkOs+7i4iJejQdktThZWVkV8asgIiqdsv/ozA5PRJ8yIyMjSaPC56fEgpNCoYCvr6844BWQNQLokydP8OzZMyxYsABA1g0ZAwICMG/ePOjq6kJPTw/R0dEwNjZGQEAA2rVrl+tz6OrqSr7LNRFRWSOTyWBhYQEzMzOkp6eXdDlExaZcuXIf3NKUrcSC061bt3Dq1Cn4+fkByDrlFhwcjICAABw7dgxbtmyBIAh48eIFFixYIAYgV1dXrF69GtWrV0dUVBTmzJlTUi+BiOiToK2tXWRfKkSfOo4cXpQ4cjgREZWkT/l7qBi/gwqSDzjkNhEREZFEDE5EREREEjE4EREREUnE4EREREQkEYMTERERkUQMTkREREQSMTgRERERScTgRERERCQRgxMRERGRRAxORERERBIxOBERERFJxOBEREREJBGDExEREZFEDE5EREREEjE4EREREUnE4EREREQkEYMTERERkUQMTkREREQSMTgRERERSaRT0gUQlUl7BpZ0BcVryP6SroCIqFiwxYmIiIhIIrY4ERFpuk+5hZKtk1TKsMWJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEginZJ88lu3buHBgwfIyMjAmzdv0L59e3Tq1AmxsbFYuXIlLCwsEBsbCxcXF2hpZWW8vJYRERERFacSC05RUVE4duwYli5dKs4bNmwYbG1tsWzZMsyfPx+mpqa4c+cONmzYgMmTJwMAXF1dc11GREREVJxKrKkmPDwcKSkpKvNatmwJHx8fpKSkwNTUFADQrFkzXL58GQAgl8tzXUZERERU3ArV4vTkyRM8evQIYWFhMDExQcOGDdG4ceMC7cPGxgarV68Wp9PT03Hjxg0MHDgQtra2KusaGxsjKioK9+7dy3VZdpgiIiIiKi4FanHy9PSEs7Mz9u/fj9TUVNSrVw/a2trw8vLCuHHjcOTIkUIVERcXh/nz52PRokVITk6GsbGxynITExOEhYUhIiIi12U5USgUiI+PV3kQERERFZbkFicPDw/Ur18fO3fuzHUdLy8vrF+/HpMmTZJcwIMHD3Du3Dm4ublBT08Pr169wqNHj1TWkcvlMDc3R0RERK7LcuLu7g43NzfJtRARERHlRVKL09mzZ9GjRw84OTnluV63bt3w7bff4sCBA5Ke/M6dO3jy5Al++ukn6OnpAQCaNGmC+/fvq6wXHR0NU1PTPJflxMXFBXFxceIjODhYUl1EREREOZHU4tS+fXvo6+tL2mHNmjVRtWrVfNdTKBTw9fVVaZ0KCAiAubk59PT0EB0dDWNjYwQEBKBdu3YAsk7L5bYsJ7q6utDV1ZVUNxEREVF+JAUnqaGpIOvfunULp06dgp+fH4CsU27BwcEICAiAq6srVq9ejerVqyMqKgpz5swRt8trGREREVFxKrFxnNq3b4/Tp0/nuKxKlSpYuHBhgZcRERERFScOuU1EREQkUYGD07FjxzBr1ixcu3YNo0ePxt9//10cdRERERFpnAIHp6CgICxYsAAnT57Eli1bkJSUVBx1EREREWmcAgenSpUq4ciRI3Bzc8OuXbsQFxdXHHURERERaZwCdw4fN26c+POwYcOKtBgiIiIiTVboq+q8vb1x9epVKJVKAIAgCHj69Cn27NlTZMURERERaZJCB6fVq1dj/vz50NL6v7N99erVK5KiiIiIiDRRoYNT//79kZ6ejgYNGojzFApFkRRFREREpIkKHZy2bdsGS0tL6Oj83y6Cg4Nx/vz5IimMiIiISNMUOjg5OTlh+vTpKrdXCQ8PL5KiiIiIiDRRoYPT06dPYWtrC2trawBZncNDQ0PFe88RERERfWoKHZymTp2KNWvWwNjYWJz3+PHjIimKiIiISBNJGgCzT58+2L59OzIyMsR5zZo1UwlNANCwYcOirY6IiIhIg0gKTkeOHEHlypUxaNAgrFq1ComJicVdFxEREZHGkRScZDIZ+vTpg0OHDqF58+YYP3485s2bx87gREREVKYUuI+Tvb097O3t8fDhQ/zyyy/Q1dXFlClTOPglERERffIK3TncxsYGmzZtwps3b7BmzRrI5XL88MMPaNGiRVHWR0RERKQxJJ2qy0uNGjWwcuVKrF69GufOncOgQYNw+vTpoqiNiIiISKMUusXpfUZGRpg3bx4UCgW2bdtWVLslIiIi0hiFbnG6du2a2rz09HRcvHgRzs7OH1QUERERkSYqdHA6fPgwVq5cibVr1yIzMxMAMHv2bCQlJWHNmjVFVR8RERGRxih0cAoMDET58uXRvn177N69GwBw8eJFfPfdd0hLSyuyAomIiIg0RaH7ONWpUwdTpkwBAPj6+gIAFAoFACA5ObkISiMiIiLSLIVucapUqRIAwNPTE56ennj06BFiY2OhVCoRGxtbVPURERERaYxCtzgNGjQI3333HapUqYK9e/diwYIFWL16NWbPnl2U9RERERFpjEIHp5YtW+LQoUPi9Pr16wFkBSoiIiKiT9EHD4BJREREVFYwOBERERFJxOBEREREJFGhg9PNmzfV7kkXFRX1wQURERERaapCB6dq1aph48aNaNmyJXbs2IEnT56gb9++RVkbERERkUYpdHCqXbs2jh07hl27duHp06fo0aMHxowZU5S1EREREWmUD+7jVK9ePSxduhS3bt3C69evi6ImIiIiIo1U6OC0e/duDBo0CFevXkVSUhIAIDQ0tMgKIyIiItI0hQ5OQ4YMQd++fbFt2zZ069YNderUQWhoKB49elSU9RERERFpjEKPHC6TyTBgwAAMGDAAAJCeno4zZ85gzpw5yMzMxKlTp4qsSCIiIiJNUOjg9L5y5cqhZ8+e6NmzJ0/ZERER0SepWAbAtLCwKI7dEhEREZUojhxOREREJBGDExEREZFEDE5EREREEjE4EREREUnE4EREREQkUZENR1BQjx8/xurVq3Hx4kW8ePFCnL9gwQLcvXtXnHZycsL06dMBALGxsVi5ciUsLCwQGxsLFxcXaGkx+xEREdHHUWLBqWHDhli7di0cHBxU5nfu3BkLFy7McRtXV1fMnz8fpqamuHPnDjZs2IDJkyd/jHKJiIiISvZUnb6+PszMzCStK5fLkZKSAlNTUwBAs2bNcPny5eIsj4iIiEhFibU45UYul2PdunV4+/YtMjIy4OLiAlNTUwQGBsLW1lZlXWNjY0RFRYlhioiIiKg4aVxwatGiBXr16gUA8Pf3x8iRI3HixAlERETA2NhYZV0TExOEhYXlGpwUCgUUCoU4HR8fX3yFExER0SdP43pWv3u7Fjs7O1haWuL169cwMzNDdHS0yrpyuRzm5ua57svd3R2VK1cWH1ZWVsVWNxEREX36Ch2cZs+ejT179ojTO3bswOzZsz+omOjoaLx+/VplXrVq1SCXy9GkSRPcv39fbf28TtO5uLggLi5OfAQHB39QfURERFS2FTo4vXr1CmFhYeJ0WFgYXr58WeD9KJVK8ed79+5h+/bt4nRGRgYCAgJgbW0NExMT6Onpia1OAQEBaNeuXZ771tXVhaGhocqDiIiIqLAK3cdp//79KtMFbW2KioqCl5cXrly5guXLl2PAgAFwdHREbGwstmzZAkEQ8OLFCyxYsAC6uroAsoYjWL16NapXr46oqCjMmTOnsOUTERERFViJdQ43NTWFs7MznJ2dVeb37t07122qVKmS6xhPRERERMVN4zqHExEREWmqAgWn8PBwlenLly9j9OjR+O6777B+/foiLYyIiIhI0xQoOI0bN078+dixY9i6dSuWLFmCffv2wcrKCkuWLCnyAomIiIg0RYH6OL17BdyZM2ewZcsWaGtrAwB69eqF//77r2irIyIiItIgBWpxkslk4s81a9YUQ1M2IyOjIimKiIiISBMVqMUpJSUFISEhEAQBFhYWiImJQZUqVQBktUYlJiYWS5FEREREmqBALU6bN29Geno6MjIy4OTkhEqVKgHIGoyyW7du6NevX7EUSURERKQJCtTiVKtWrRznf/nllzhz5kyRFERERESkqQrU4pSamoqff/4ZvXr1goeHh8qyc+fO4cCBA0VaHBEREZEmKVBwWrp0KYYNG4bjx4+jdu3a2LBhg7js66+/hpubW5EXSERERKQpChScqlWrhiZNmgAAunbtilq1auHGjRsAsq64a9asWdFXSERERKQhChScDAwMVKZ79OiBgIAAREdHAwB0dErs1ndERERExa5AwUmpVCI6OhqvX78W540fPx47d+5EWFhYkRdHREREpEkKFJycnZ2xatUquLu7i/NkMhmmTZsGb29vpKenF3mBRERERJqiQOfWwsLCcr0f3aBBg2Bvb18kRRERERFpogIFpz59+qBHjx4qt17JplQqcfr0ady8ebPIiiMiIiLSJAUKTnPmzBH7N5mZmaFBgwbiMkEQEB4eXrTVEREREWmQAgWnd2+pEhkZiSdPnkAQBKSmpqJWrVpYt25dkRdIREREpCkKNX5AcnIyAgICcOnSJQQEBEBfXx/t2rXDpEmTiro+IiIiIo1RoOA0b948MSi1bdsW/fv3x5IlS8Q+T+fOnUOnTp2KpVAiIiKiklag4BQQEIBZs2aJA10mJyfD19cXAJCZmYn169czOBEREdEnq0DBadeuXXjx4gXkcjkcHBygr6+vstzW1rZIiyMiIiLSJAUaAHP37t04ceIEXr58CWdnZyQkJKgsNzIyKsraiIiIiDRKgVqcIiMjsXDhQgBZV9gdOnQII0eOLJbCiIiIiDRNgVqc6tatK/5sZmYGPT29Ii+IiIiISFMVKDilpqaqTBsYGKhM+/j4fHhFRERERBqqQKfqZs6cCU9PT8hkMgiCgISEBGzevBlA1lV1ly5dQnJycrEUSkRERFTSChSc9uzZg+7du+e6/PTp0x9cEBEREZGmknSqLjExEQDyDE0A0LVrVwBAfHz8B5ZFREREpHkkBSc/Pz+8evVK0g4fPXoEb2/vDyiJiIiISDNJCk6Ojo64ePEiTpw4ked6O3bsgJ+fH7799tsiKY6IiIhIk0ju4zRq1ChcunQJ33//PYyNjVG3bl2YmpoiNjYWz58/R2JiIoYPH462bdsWZ71EREREJaZAncM7dOiADh06IDQ0FE+ePEFERAQ+//xz9OjRA9WrVy+uGomIiIg0QoGCUzYLCwtYWFgUdS1EREREGq1AA2ASERERlWUMTkREREQSMTgRERERScTgRERERCQRgxMRERGRRAxORERERBIxOBERERFJxOBEREREJFGhBsDMT1xcHCpXrpznOo8fP8bq1atx8eJFvHjxQpwfGxuLlStXwsLCArGxsXBxcYGWlla+y4iIiIiKm+TU4e3tjVGjRuW73ubNm9GxY8d812vYsCHWrl0LY2Njlfmurq6YNm0aJk6ciC5dumDDhg2SlhEREREVN8nB6fnz53j16hUAYP/+/Xj27FmOjxYtWkBXV1fSPvX19WFmZiZOy+VypKSkwNTUFADQrFkzXL58Od9lRERERB+D5OA0duxYAMCrV68wdepU7N27F126dFH79+jRoxAEoVDFBAYGwtbWVmWesbExoqKi8lxGRERE9DFICk5hYWHiz7Vr10bDhg3x66+/olatWjn+K7XF6X0RERFqp+5MTEwQFhaW57LcKBQKxMfHqzyIiIiICktScPLw8MB3332Hhw8f4qeffkJMTAwAQCaT5fhvYZmZmSE6Olplnlwuh7m5eZ7LcuPu7o7KlSuLDysrqw+qj4iIiMo2ScHJzc0Nhw4dgrW1NWbOnIkKFSoUSzFNmjTB/fv3VeZFR0fD1NQ0z2W5cXFxQVxcnPgIDg4ulrqJiIiobCjQcAQymQzVq1dHZmYmzp49C7lcnuO/mZmZkvepVCrFn01MTKCnp4fo6GgYGxsjICAA7dq1y3dZbnR1dQt92pCIiIjofQUKThkZGQgLC4OzszNCQ0Mxffr0HP99NwzlJioqCl5eXrhy5QqWL1+OAQMGoHbt2nB1dcXq1atRvXp1REVFYc6cOeI2eS0jIiIiKm6SgtObN28gk8kwaNAgmJubY+LEiXmuv2bNmnz3aWpqCmdnZzg7O6vMr1KlChYuXJjjNnktIyIiIipukoKTj48Ptm/fjh49eiAkJAT+/v5o3rx5jutmZGTgt99+K9IiiYiIiDSBpOA0dOhQDB06FIGBgVi2bBn27t2LgQMHokKFCipjNimVSigUCqSmpuKzzz5D3bp1i61wIiIioo+tQH2cmjRpgnXr1uHXX3/F8ePH0ahRI7Ro0aK4aiMiIiLSKIW6Q66JiQlGjRoFQ0NDcUwnIiIiok9dgVqc3tegQYOiqoOIiIhI4xWqxYmIiIioLGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikkinpAv4lAQEx5Z0CcXqy5IugIiIqISxxYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiXlVHVAJ4BSYRUenEFiciIiIiiTS2xWnBggW4e/euOO3k5ITp06cjNjYWK1euhIWFBWJjY+Hi4gItLeY/IiIiKn4aG5w6d+6MhQsXqs13dXXF/PnzYWpqijt37mDDhg2YPHlyCVRIREREZU2paqqRy+VISUmBqakpAKBZs2a4fPlyCVdFREREZYXGtjjJ5XKsW7cOb9++RUZGBlxcXHDv3j3Y2tqqrGdsbIyoqCgxTBEREREVF40NTi1atECvXr0AAP7+/hg5ciSGDh0KY2NjlfVMTEwQFhaWY3BSKBRQKBTidHx8fPEWTURERJ80jT1VZ2FhIf5sZ2cHS0tLpKWlITo6WmU9uVwOc3PzHPfh7u6OypUriw8rK6tirZmIiIg+bRoZnKKjo/H69WuVedWqVYONjQ3u37+vtm5up+lcXFwQFxcnPoKDg4utZiIiIvr0aWRwunfvHrZv3y5OZ2RkICAgAI0aNYKenp7Y6hQQEIB27drluh9dXV0YGhqqPIiIiIgKSyP7ODk6OiI2NhZbtmyBIAh48eIFFixYAF1dXbi6umL16tWoXr06oqKiMGfOnJIul4iIiMoIjQxOANC7d+8c51epUiXH8Z2IiIiIiptGnqojIiIi0kQMTkREREQSMTgRERERSaSxfZyIiChLQHBsSZdQbL4s6QKICojBiYiI6BPBkF38eKqOiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJGJyIiIiIJGJwIiIiIpKIwYmIiIhIIgYnIiIiIokYnIiIiIgkYnAiIiIikojBiYiIiEgiBiciIiIiiRiciIiIiCRicCIiIiKSiMGJiIiISCIGJyIiIiKJdEq6gMKIjY3FypUrYWFhgdjYWLi4uEBLixmQiIiIilepTBuurq6YNm0aJk6ciC5dumDDhg0lXRIRERGVAaUuOMnlcqSkpMDU1BQA0KxZM1y+fLmEqyIiIqKyoNQFp8DAQNja2qrMMzY2RlRUVAlVRERERGVFqevjFBERAWNjY5V5JiYmCAsLE1uhsikUCigUCnE6Li4OABAfH18stSWmZhTLfjVFcb1vZRGPFSqIT/l44bFStHisfNi+BUHId91SF5zMzMzw6NEjlXlyuRzm5uZq67q7u8PNzU1tvpWVVbHV90lzrVzSFVBpwWOFpOKxQlJ9hGMlISEBlSvn/TylLjg1adIE+/fvV5kXHR2t1toEAC4uLpg+fbo4rVQqER0dDRMTE8hksmKvtTjFx8fDysoKwcHBMDQ0LOlySIPxWCGpeKxQQXxKx4sgCEhISIClpWW+65a64GRiYgI9PT1ER0fD2NgYAQEBaNeuXY7r6urqQldXV2WekZHRR6jy4zE0NCz1Byx9HDxWSCoeK1QQn8rxkl9LU7ZSF5yArOEIVq9ejerVqyMqKgpz5swp6ZKIiIioDCiVwalKlSpYuHBhSZdBREREZUypG46Asujq6uLXX39VOxVJ9D4eKyQVjxUqiLJ6vMgEKdfeERERERFbnIiIiIikYnAiIiIikojBiYiIiEgiBiciIiIiiRicPiFpaWni7WiUSiWUSmUJV0SaKjMzEwDEY+T169fw8vIqyZJIQ/FYIVLF4FTKZd9GBgBevHgh3ptPS0sLWlpZv97sDz6i9PR0nD59GidOnACQdZzcv38fc+fOxc8//4xXr16VbIGkMXisUGFlZmaKN8tNTk5GYmIi0tLSSriqosPgVMrt378fXl5eSEtLg7W1NerWrQs/Pz8AwOPHj/Hjjz/i/v37JVwllSRBEMTWgnLlyiE1NRXBwcFYtmwZunfvDjc3N3z77be4dOkSateuLenu4PRp4rFChRUfH4+5c+dCLpdDW1sbMpkMkZGRmDhxIr788kucPXu2pEssMgxOpZyRkREyMjLEVqeWLVuKB6i3tzfq16+PL7/8sgQrpJImk8mgpaWF5ORkHDlyBBs2bMD//vc/6OjoYNOmTdi7dy+0tbUxZcoUxMXFlfobYFPh8VihglAqleIZjTdv3qBGjRrQ09PDjh07MGTIEIwePRo1atTAwYMH0aNHjxKutuiUyluulFVJSUkwMDAQ/yrU1taGra0tQkJCEBISAnNzc7Rr1w7nz59HVFQUUlJSMGjQoJIumz6y2NhYeHh4oEGDBujTpw9OnToFLy8vPHr0CG3btsX8+fNx6dIl1K9fH1u3bsXFixfRqlUrhIWF4e3bt6hcuTIEQeCXYhmRmJiIihUrAgD++ecfeHp68lghNTn9nrO7gwDA7du34e7ujhMnTqBp06b48ccf0a5du49d5kfB4FQKJCcnw93dHW3atEHXrl0hk8mgra0NAKhRowaCgoJgaGgIW1tbGBsbw8DAALNnz8aSJUtgbm5ewtXTxxIUFIQDBw7gzJkziI2NxYwZM3D48GFs27YNkyZNwh9//IHXr1/j7Nmz2LlzJ96+fYvIyEj8+OOP6N+/P7y8vHDixAlYW1vzy7AMeP9z5dChQ9i2bRsmT57MY4XUZP+Oo6OjYWxsDCArLB0/fhx+fn4oX748+vfvDxcXFyQmJoqnfJOSkhAZGYnatWuXVOlFjqfqNEh6errYHykqKgrOzs5IS0vDzZs38fz5czg4OIjrvnnzBr/++iv69OmDf/75B48fP0ZoaCgA4JtvvoGZmRnMzc15Zd0nLjY2Fhs3bkT//v0xZMgQyGQybNmyBYGBgXB2dkbfvn1x8uRJ6OnpoW3btpg+fTp8fHzw66+/YvPmzXB1dUWLFi3g4eEBPT09PHjwAIDqX5JUukn9XPn2229x6tQpHiukJrsv26ZNm7Blyxakp6cDALZs2QIjIyMcPnwYc+fOhZ2dHapWrYqTJ0/i77//BgCcOXMG/fv3x507d0qs/qLGI16DxMbGYuHChZDL5TA1NUV8fDxevnyJp0+fYvLkyahQoQIiIiLg7+8Pb29vJCUl4Y8//sDt27dRs2ZNhISEAABat26NhIQEhIeHQ0tLC4IgID09HcnJySX8CqkoKJVKHDt2DCNGjECXLl1w7949TJgwAX369MGMGTNQt25d+Pr64tSpU+JficnJyahXrx6OHDmCv//+Gw8fPoSPjw9++OEHjBs3Du7u7rCyskK1atVw+fJl8Xmo9JPyuRIWFoanT58CyDp1x2OlbMv+fgGAGTNmwMvLC6mpqbhw4QJatmyJcuXKAQD++usvzJw5EwYGBtDT0xNbouzs7HD9+nV07twZe/bswcCBA9GwYcMSez1FjafqNMiNGzdgbW0NExMTAMD48eMxefJkDB8+HI0bNwYAPHz4ELdv38asWbMwbNgwcVsdHR28efMG8fHxMDQ0xBdffAFvb28MHDgQMpkMcXFxePr0Kdq2bVsir42Kxvz583Hu3DnY2NjA2dkZt27dQsuWLdGhQweEhITgp59+Qnp6Oq5fvw5jY2N0794dANCtWzccPnwYwcHBCAwMxPHjxxEYGIivvvoKK1euxNOnT+Hr64tatWrh4sWLsLe35+mXT4SUz5VHjx7h9u3bsLW1RY8ePXD06FEeK2XYsWPHEBkZif79+6NmzZoICgpChQoV0K1bN9jb2yMjIwO3bt1CmzZtxFO1p06dQnJyMv7++29ER0dDS0sLgwYNwogRI5CWloby5csDyLmvVGnDFicNkJGRASDrw+vJkyfi/NjYWCQlJWH48OGoVKkSAODSpUuws7MDkDVWRvYVDY0aNUJMTAzCw8MBAI6Ojrh16xaArL8GTU1NceDAAQQFBX2010VFr2fPnjh16hS2bt0KJycnDBgwAGvWrMGsWbOwfv16XL16Fd988w3u37+PJk2aiK0IAPDll19i586dkMlkuHDhApYsWYLExERMmzYNY8eOxYULF9C7d288e/YMSUlJpf7DrawryOeKt7c3mjZtKq7DY6XsCgkJQUxMDLp06QIA+PHHH3H+/Hn4+PigY8eOAICwsDCcPHkSwcHB4u8+LS0NISEhmDlzJs6ePQt3d3fExsZCEAScO3cO06dPx+3btz+J1kkGJw2go5PV8JeamorExETEx8cjOTkZTZs2Rb169eDh4YHZs2eLB7KtrS0iIyOhra0tHoS2trZ48+YNQkNDIQgCGjduDKVSKZ6uA4Dg4GDcvHmzZF4kFYmWLVvC1NQUN2/exKJFizB58mTcu3cPRkZGOHnyJEaPHo3PP/8cQFZzefbghQDQqVMnBAcHo23btvjjjz8wa9Ys+Pv7o0WLFrh16xZ27NgBKysr1KlTBz4+PgDAcXpKscJ8rmT/4eXk5MRjpYxSKBS4du0aFAoFgKyLTqKjo+Hg4IAaNWoAAC5evIiqVavCyspK/A5yc3PDtm3bsGnTJoSEhMDCwgJv375FWFgYunfvjkmTJmH9+vVYtGiReJyVVgxOGuLs2bMYO3Ysvv76a1y5cgWZmZn4/PPPsW7dOhgYGKBVq1Y4evQofvjhB0yfPh3t2rVDenq6eK45PDwcSqUSr169Esd0sra2xuLFizF9+nQ4OTnBysqKp+o+ERs2bEBgYCA8PT2xePFiWFlZoWrVqqhRowbu3r0LIKsVMrsDLwA0aNAA2traOHz4MNq1a4dLly7BxcUFhoaGAIDIyEisWbMG586dg56eHgCwJaGUK+jnir29PdLT02FjY8NjpYzJDr53795F48aNxUAsk8kwbtw4HDlyBIsXL0aPHj1w7do1dOjQAWFhYeLvPTMzE2/evIGuri6qVKkCMzMz1KlTB9evXwcAfPbZZ3Bzc0OjRo1KfX9bBicN0bJlS1haWqJ58+bYtGkTKlWqBB0dHZw7dw6pqanIzMzE6NGjMWbMGNSsWRPHjx9HuXLlkJaWhm3btmHnzp1iS9SbN28AAP369YORkRGaNWuGEydOYO3atahVq1YJv1L6ENkfbqNGjYK1tTW0tbVhb28Pf39/AEDTpk1x/PhxODo64uDBg8jIyICvr6+4ffPmzZGUlIRvv/0WQFa48vT0RGxsrLivI0eOwMnJ6eO/OCpyhf1cAXislDUymQwvX76EkZERFi1aJAae8uXLw9nZGf3790ft2rWxa9curFu3DseOHUO9evXEAZe1tbVx9epVODk5QV9fH1paWujUqRMCAgLE56hduzb69u2LOnXqlMRLLDoCaRxfX18hMTFROH36tNC6dWuhUaNGwuTJkwU/Pz+1db29vYWJEyeK07/++qtw4cIFITMzU23dzMxMQalUFmvt9PEMHjxYCA8PF4KDg4Vdu3aJv/ODBw8K3t7egiAIwq1bt4Q5c+aI24SGhgq9e/dW2c+ZM2cEhUKhMo/HyadH6udK9nHEY6Xsefd36eDgILx69UoQBEEICAgQ3N3dhR07dgjjx48XOnXqJEybNk24evWquH5mZqawbt06IS4uThAEQYiJiRFmzJghLF68OMfvo9KMV9VpGB8fH5w4cQIuLi744osvxCsShg0bBiMjIwBZTaLZt0YwMTHBxIkTxe3Hjx8PCwsLlX0K//8qBo638mnI/n3a2NggMDAQlSpVQkZGhvj7/e6778R1//33X7x9+1a8qsXc3ByWlpZ4/vw56tWrByBr3K/38bTLp0Xq50r2zcEFQeCxUgbJZDJkZmZCW1sbmzdvhr6+Pm7evInff/8dd+/eRcuWLTFixAh8/fXXatveunULLVq0EE/nxsTEoHz58nBwcPjkvntkgsAefZri7NmzcHV1xYQJE9CrVy9UrlwZQFZ/lnr16sHBwQG6urolXCVpivv372P58uXo2rUrnJycxFHifX19sX//fty7dw9ffPEFoqOjMWjQIPTt2xcAVC4Npk/fh3yu8Fgpm/z9/eHp6YmrV6/C1NQUPXv2RFRUFBwcHNCkSRMAqn/AA1n9bCtWrAgDA4OSLP2jYHDSIMJ741tkZGRAR0cHnp6eCAkJwcCBA1GlSpUSrJA0zY8//ogff/wRVatWxdatW+Hj4wMDAwN069YNvXv3homJCaKjoyGXy8VWAypb+LlCBZGYmIj+/fvDyckJQ4cOhaWlJYCsAP769Wt0795d7axGWcNTdRok+8NNqVRCS0tLvB9d48aNERMTI3baJMr+MrS2tka/fv1gbW2Nr776Cn/++Sc+++wzlfWMjIzEEX2p7OHnChVExYoVcfr0aXE6+9RdgwYNcPv2bcTExJT54MQWJ6JS7OXLl+jduzcCAwNV5md/2BERFYZSqYRMJlNprdy+fTscHR0/qRv2FgaDUynyfpM7lW3Zx8Pp06fh4OAAPT09tQ86ovzwc4WoYBiciIiIiCT6tK4RJCIiIipGDE5EREREEjE4EREREUnE4EREREQkEYMTERERkUQMTkREREQSMTgRERERScTgRERERCQRgxMRERGRRAxORERERBL9P+SKUCelcj/YAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 600x400 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# (2) 可视化：区域 PM25 & 人均 PM25 对比 (双柱简单示意)\n",
    "\n",
    "# 设置中文显示和负号正常显示\n",
    "plt.rcParams['font.sans-serif'] = ['AR PL UMing CN']\n",
    "plt.rcParams['axes.unicode_minus'] = False\n",
    "\n",
    "plt.figure(figsize=(6,4))\n",
    "x = region_summary.index\n",
    "plt.bar(x, region_summary['PM25'], width=0.4, alpha=0.7, label='PM25 平均')\n",
    "\n",
    "plt.bar(x, region_summary['PM25_per_million']*100, width=0.4, alpha=0.7, label='人均 PM25 (×100示意)')\n",
    "plt.title('区域 PM25 与人均指标对比 (示意)')\n",
    "plt.ylabel('值 (PM25: μg/m³)')\n",
    "plt.xticks(rotation=20)\n",
    "plt.legend()\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c2d97d1",
   "metadata": {},
   "source": [
    "> 说明：人均指标乘以 100 只是为了放在同一尺度上做演示，真实比较应规范化或用双轴 (本课程暂不展开)。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f34f854",
   "metadata": {},
   "source": [
    "### 示例结论草稿 (请自行再加工)\n",
    "华南与西南区域 PM25 平均值位于前列，同时人均 PM25 示意指标（缩放后）也较高，说明这些区域内人口集中城市的空气质量治理仍需关注。需要注意样本仅 5 天，且指标简化，不代表长期趋势。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "85f6b4f9",
   "metadata": {},
   "source": [
    "### 小练习 5\n",
    "1. 选取 PM_diff 最大的前 5 个城市，看看是否集中在某区域。\n",
    "2. 重新绘制一张图：只展示 top5 城市的 PM25 平均值。\n",
    "3. 写 2 句说明该图表达了什么。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca2c7877",
   "metadata": {},
   "source": [
    "## 8. 小结\n",
    "今日串联：\n",
    "1. merge：按 key（city）横向引入维度信息；how 影响保留范围（inner/left 等）。\n",
    "2. concat：纵向追加同结构数据，注意列自动对齐与缺失填补。\n",
    "3. groupby + agg：支持多指标统计，形成结构化摘要；链式：排序→筛选→可视化。\n",
    "4. 衍生指标：基于领域/问题设计（人均、差值、比率）→ 先说明计算假设。\n",
    "5. Mini Project：强调“问题驱动”而非“函数堆砌”。\n",
    "\n",
    "实务建议：\n",
    "- 每做一次合并后立即 shape / isnull().sum() 自检。\n",
    "- 聚合输出命名清晰（mean / max），必要时 reset_index 方便后续处理。\n",
    "- 指标缩放或示意需注明倍数，保持透明与可复现。\n",
    "\n",
    "衔接：明天的入门模型会利用今天的“整洁 + 衍生”思路构造特征。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "5da8fa48",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_merge\n",
      "both          30\n",
      "left_only      0\n",
      "right_only     0\n",
      "Name: count, dtype: int64\n",
      "        PM25_mean  PM25_max  PM10_mean\n",
      "region                                \n",
      "华东           42.0        47       56.0\n",
      "华北           62.2        66       82.2\n",
      "华南           42.2        45       54.4\n",
      "西南           57.3        62       74.3\n",
      "  city       date  PM25  PM25_rank_in_city\n",
      "0   广州 2025-09-01    42                3.0\n",
      "1   广州 2025-09-02    41                4.0\n",
      "2   广州 2025-09-03    39                5.0\n",
      "3   广州 2025-09-04    45                1.0\n",
      "4   广州 2025-09-05    44                2.0\n",
      "date    2025-09-01  2025-09-02  2025-09-03  2025-09-04  2025-09-05\n",
      "region                                                            \n",
      "华东            41.5        43.5        41.0        41.5        42.5\n",
      "华北            60.0        65.0        58.0        62.0        66.0\n",
      "华南            42.0        41.0        39.0        45.0        44.0\n",
      "西南            56.5        55.5        58.0        59.5        57.0\n",
      "过滤后城市数: 6\n"
     ]
    }
   ],
   "source": [
    "# 进阶合并与分组示例\n",
    "import pandas as pd\n",
    "air = pd.read_csv('../data/air_quality_timeseries.csv', parse_dates=['date'])\n",
    "info = pd.read_csv('../data/city_info.csv')\n",
    "\n",
    "# 1. 合并并带来源指示\n",
    "m = pd.merge(air, info, on='city', how='left', indicator=True)\n",
    "print(m['_merge'].value_counts())\n",
    "\n",
    "# 2. 多级 agg 扁平化\n",
    "agg_df = m.groupby('region').agg({'PM25':['mean','max'], 'PM10':'mean'})\n",
    "agg_df.columns = ['_'.join(col) for col in agg_df.columns]\n",
    "print(agg_df.head())\n",
    "\n",
    "# 3. 分组排名\n",
    "m['PM25_rank_in_city'] = m.groupby('city')['PM25'].rank(ascending=False, method='dense')\n",
    "print(m[['city','date','PM25','PM25_rank_in_city']].head())\n",
    "\n",
    "# 4. 透视表雏形\n",
    "pivot = m.pivot_table(values='PM25', index='region', columns='date', aggfunc='mean')\n",
    "print(pivot.head())\n",
    "\n",
    "# 5. 分组过滤：只保留记录数>=5 的城市\n",
    "filtered = m.groupby('city').filter(lambda d: len(d) >= 5)\n",
    "print('过滤后城市数:', filtered['city'].nunique())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37f38ca8",
   "metadata": {},
   "source": [
    "### 合并 / 分组进阶技巧\n",
    "1. 验证连接是否丢数据：合并前后行数 + 是否出现空值。\n",
    "2. 指定左右键不同名：pd.merge(a,b,left_on='city', right_on='city_name')。\n",
    "3. 指定多键：on=['date','city']，确保业务唯一。\n",
    "4. 指标扁平化：multi_agg.columns = ['_'.join(col)] 去除多级列。\n",
    "5. 分组过滤：groupby('city').filter(lambda d: len(d)>3)。\n",
    "6. 分组排名：merged['pm25_rank_city'] = merged.groupby('city')['PM25'].rank(ascending=False)。\n",
    "7. 高效统计唯一个数：nunique(dropna=False)。\n",
    "8. 交叉表/透视需求起点：merged.pivot_table(values='PM25', index='region', columns='date', aggfunc='mean') (了解)。\n",
    "9. 丢失键分析：使用 indicator=True 查看合并来源。\n",
    "10. 内存友好：只选用必要列再合并，减少宽表。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7610bfd6",
   "metadata": {},
   "source": [
    "## 9. 课后作业要求\n",
    "提交 *mini_project_day6.ipynb*：\n",
    "1. 标题：你的项目主题\n",
    "2. 至少 2 个明确分析问题 (问题句式)\n",
    "3. 数据步骤：读取→合并→新增列→分组→图表 (每步有 Markdown 注释)\n",
    "4. 结果：1~2 张图 + 关键表\n",
    "5. 结论：100~150 字（含发现 + 局限 + 后续可以加的数据）\n",
    "6. 伦理与来源：写出数据来源与是否做过简化/假设\n",
    "(可选) 7. 将其中一段重复统计封装成函数 (如：传入 region 返回其 PM25 均值)。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49f9d899",
   "metadata": {},
   "source": [
    "---\n",
    "📌 提示：明天 Day 7 只做最小模型体验，保持 Notebook 干净结构利于后续复用。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
